Intrinsic Evaluation of Text Mining Tools May Not Predict Performance on Realistic Tasks
Authors
Abstract
Biomedical text mining and other automated techniques are beginning to achieve performance which suggests that they could be applied to aid database curators. However, few studies have evaluated how these systems might work in practice. In this article we focus on the problem of annotating mutations in Protein Data Bank (PDB) entries, and evaluate the relationship between performance of two automated techniques, a text-mining-based approach (MutationFinder) and an alignment-based approach, in intrinsic versus extrinsic evaluations. We find that high performance on gold standard data (an intrinsic evaluation) does not necessarily translate to high performance for database annotation (an extrinsic evaluation). We show that this is in part a result of lack of access to the full text of journal articles, which appears to be critical for comprehensive database annotation by text mining. Additionally, we evaluate the accuracy and completeness of manually annotated mutation data in the PDB, and find that it is far from perfect. We conclude that currently the most cost-effective and reliable approach for database annotation might incorporate manual and automatic annotation methods.
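The gap between the two kinds of evaluation can be illustrated with a minimal sketch. It assumes both are scored with set-based precision and recall, which the abstract does not state. All identifiers and data below are hypothetical toy values, not results from the study. The intrinsic case scores extracted mentions against a gold standard corpus. The extrinsic case scores annotations proposed for database entries when the tool only sees abstracts rather than full text.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for two collections of annotations."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Intrinsic evaluation (hypothetical data): mutation mentions extracted from
# documents, compared with a hand-annotated gold standard for the same documents.
gold_mentions = {("doc1", "A123T"), ("doc1", "G45D"), ("doc2", "L10P")}
extracted_mentions = {("doc1", "A123T"), ("doc1", "G45D"), ("doc2", "L10P")}
intrinsic = precision_recall(extracted_mentions, gold_mentions)

# Extrinsic evaluation (hypothetical data): mutations that should be annotated
# on database entries, versus what the tool recovers from abstracts alone.
db_annotations = {
    ("1ABC", "A123T"), ("1ABC", "G45D"),
    ("2XYZ", "L10P"), ("2XYZ", "K7R"),
}
abstract_only_predictions = {("1ABC", "A123T"), ("2XYZ", "L10P")}
extrinsic = precision_recall(abstract_only_predictions, db_annotations)

print("intrinsic (precision, recall):", intrinsic)
print("extrinsic (precision, recall):", extrinsic)
```

In this toy setup the tool is perfect on the gold standard but recovers only half of the database annotations, because two of the mutations are never mentioned in the text it was given.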
Similar resources
Intrinsic Evaluation of Word Vectors Fails to Predict Extrinsic Performance
The quality of word representations is frequently assessed using correlation with human judgements of word similarity. Here, we question whether such intrinsic evaluation can predict the merits of the representations for downstream tasks. We study the correlation between results on ten word similarity benchmarks and tagger performance on three standard sequence labeling tasks using a variety of...
Mutation extraction tools can be combined for robust recognition of genetic variants in the literature
As the cost of genomic sequencing continues to fall, the amount of data being collected and studied for the purpose of understanding the genetic basis of disease is increasing dramatically. Much of the source information relevant to such efforts is available only from unstructured sources such as the scientific literature, and significant resources are expended in manually curating and structur...
The relationship between parents' rating and performance-based measure of executive function in preschool children
Introduction: Both performance-based and rating measures are used to evaluate preschool children’s executive functions. This study aimed to investigate the relationship between performance-based tasks and parental rating of executive functions in preschool children. Method: The present study was a descriptive correlational study. The current study population consisted of all 4 and 5-years-old p...
Task Type and Prompt Effect on Test Performance: A Focus on IELTS Academic Writing Tasks
Recent versions of international high-stakes tests like TOEFL and IELTS have made use of integrated tasks in addition to the traditional independent tasks in a claim to provide a more realistic estimation of the test takers’ language abilities. The present study aimed to investigate how test takers’ performance may differ on such tasks. As such, the test takers’ performance was compared on IELT...
Topic Modeling and Classification of Cyberspace Papers Using Text Mining
The global cyberspace networks provide individuals with platforms to can interact, exchange ideas, share information, provide social support, conduct business, create artistic media, play games, engage in political discussions, and many more. The term cyberspace has become a conventional means to describe anything associated with the Internet and the diverse Internet culture. In fact, cyberspac...
Journal title:
- Pacific Symposium on Biocomputing
Volume, issue
Pages -
Publication date 2008